
    Optimal rates and adaptation in the single-index model using aggregation

    We want to recover the regression function in the single-index model. Using an aggregation algorithm with local polynomial estimators, we answer, in particular, the second part of Question 2 from Stone (1982) on the optimal convergence rate. The procedure constructed here has strong adaptation properties: it adapts both to the smoothness of the link function and to the unknown index. Moreover, the procedure adapts locally to the distribution of the design. We propose new upper bounds for the local polynomial estimator (results of independent interest) that allow a fairly general design. The behavior of this algorithm is studied through numerical simulations. In particular, we show empirically that it improves substantially on empirical risk minimization.
    Comment: 36 pages
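
    Below is a minimal sketch of this kind of procedure, not the paper's algorithm: local linear estimation along candidate index directions, combined by exponential-weight aggregation on a held-out split. The data-generating model, the candidate grid, the bandwidth and the aggregation temperature are all illustrative assumptions.

    import numpy as np

    def local_linear(z_train, y_train, z_query, h):
        # Local linear estimate of E[Y | Z = z] at each query point (Gaussian kernel).
        preds = np.empty_like(z_query)
        for i, z0 in enumerate(z_query):
            w = np.exp(-0.5 * ((z_train - z0) / h) ** 2)          # kernel weights
            A = np.column_stack([np.ones_like(z_train), z_train - z0])
            Aw = A * w[:, None]
            beta = np.linalg.solve(A.T @ Aw + 1e-10 * np.eye(2), Aw.T @ y_train)
            preds[i] = beta[0]                                    # intercept = fit at z0
        return preds

    rng = np.random.default_rng(0)
    n, d = 400, 3
    theta_true = np.array([0.8, 0.6, 0.0])                        # unknown index (illustrative)
    X = rng.normal(size=(n, d))
    y = np.sin(X @ theta_true) + 0.1 * rng.normal(size=n)         # single-index model: Y = g(theta'X) + noise

    # Candidate index directions (a crude grid on the sphere) and a fixed bandwidth.
    candidates = [np.array(v, dtype=float) / np.linalg.norm(v)
                  for v in ([1, 0, 0], [0, 1, 0], [0.8, 0.6, 0.0], [0.6, 0.8, 0.0])]
    h, split = 0.3, n // 2
    risks, fits = [], []
    for theta in candidates:
        z_tr, z_val = X[:split] @ theta, X[split:] @ theta
        pred = local_linear(z_tr, y[:split], z_val, h)
        risks.append(np.mean((y[split:] - pred) ** 2))
        fits.append(pred)

    # Exponential-weight aggregation of the candidate estimators (temperature chosen arbitrarily).
    risks = np.array(risks)
    weights = np.exp(-split * (risks - risks.min()))
    weights /= weights.sum()
    aggregated = weights @ np.array(fits)                         # aggregated prediction on the validation half
    print("validation risks:", np.round(risks, 4), "weights:", np.round(weights, 3))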

    Nonparametric regression with martingale increment errors

    We consider the problem of adaptive estimation of the regression function in a framework where we replace ergodicity assumptions (such as independence or mixing) with another structural assumption on the model. Namely, we propose adaptive upper bounds for kernel estimators with data-driven bandwidth (Lepski's selection rule) in a regression model where the noise is a martingale increment. This framework includes, as very particular cases, the usual i.i.d. regression and autoregressive models. The cornerstone tool for this study is a new result for self-normalized martingales, called "stability", which is of independent interest. In the first part, we use only the martingale increment structure of the noise and give an adaptive upper bound with a random rate that involves the occupation time near the estimation point. Thanks to this approach, the theoretical study of the statistical procedure is disconnected from usual ergodicity properties such as mixing. In the second part, we make the link with the usual minimax theory of deterministic rates: under a beta-mixing assumption on the covariate process, we prove that the random rate considered in the first part is equivalent, with high probability, to a deterministic rate, namely the usual adaptive minimax one.
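
    The following toy sketch illustrates a Lepski-type bandwidth selection for a Nadaraya-Watson estimator at a single point. The bandwidth grid, the noise level and the threshold constant are illustrative assumptions, not the paper's calibration, and the i.i.d. design is only the simplest special case of the martingale-increment setting.

    import numpy as np

    def nw_estimate(x_train, y_train, x0, h):
        # Nadaraya-Watson estimate at x0 with a Gaussian kernel; also return the effective sample size.
        w = np.exp(-0.5 * ((x_train - x0) / h) ** 2)
        return np.sum(w * y_train) / np.sum(w), np.sum(w)

    def lepski_select(x_train, y_train, x0, bandwidths, sigma=0.1, c=1.0):
        # Keep the largest bandwidth whose estimate stays within a noise-level
        # threshold of every estimate obtained with a smaller bandwidth.
        bandwidths = sorted(bandwidths)
        results = [nw_estimate(x_train, y_train, x0, h) for h in bandwidths]
        estimates, neff = zip(*results)
        selected = bandwidths[0]
        for j in range(1, len(bandwidths)):
            if all(abs(estimates[j] - estimates[i]) <= c * sigma / np.sqrt(neff[i])
                   for i in range(j)):
                selected = bandwidths[j]
            else:
                break
        return selected

    rng = np.random.default_rng(1)
    x = rng.uniform(-1, 1, size=500)
    y = np.cos(3 * x) + 0.1 * rng.normal(size=500)        # i.i.d. regression, a special case of the martingale-increment model
    h_grid = [0.02 * 1.5 ** k for k in range(10)]
    h_star = lepski_select(x, y, x0=0.2, bandwidths=h_grid)
    print("selected bandwidth:", round(h_star, 3))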

    Robust Methods for High-Dimensional Linear Learning

    We propose statistically robust and computationally efficient linear learning methods in the high-dimensional batch setting, where the number of features d may exceed the sample size n. In a generic learning setting, we employ two algorithms, depending on whether the considered loss function is gradient-Lipschitz or not. We then instantiate our framework on several applications, including vanilla sparse, group-sparse and low-rank matrix recovery. This leads, for each application, to efficient and robust learning algorithms that reach near-optimal estimation rates under heavy-tailed distributions and in the presence of outliers. For vanilla s-sparsity, we are able to reach the s log(d)/n rate under heavy tails and η-corruption, at a computational cost comparable to that of non-robust analogs. We provide an efficient implementation of our algorithms in an open-source Python library called linlearn, with which we carry out numerical experiments that confirm our theoretical findings, together with a comparison to other recent approaches proposed in the literature.
    Comment: accepted version
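
    As a rough illustration of robust linear learning under heavy tails and outliers (not the linlearn API, and not necessarily the estimator analyzed in the paper), the sketch below runs gradient descent for least squares with the gradient estimated by a coordinate-wise median-of-means over data blocks; the corruption level, block count and step size are arbitrary choices.

    import numpy as np

    def mom_gradient(X, y, w, n_blocks):
        # Median-of-means estimate of the least-squares gradient: split the data into
        # blocks, compute the gradient on each block, take a coordinate-wise median.
        idx = np.random.permutation(len(y))
        block_grads = []
        for block in np.array_split(idx, n_blocks):
            r = X[block] @ w - y[block]
            block_grads.append(X[block].T @ r / len(block))
        return np.median(np.array(block_grads), axis=0)

    def mom_least_squares(X, y, n_blocks=11, lr=0.1, n_iter=300):
        w = np.zeros(X.shape[1])
        for _ in range(n_iter):
            w -= lr * mom_gradient(X, y, w, n_blocks)
        return w

    rng = np.random.default_rng(2)
    n, d = 1000, 5
    X = rng.normal(size=(n, d))
    w_true = np.array([1.0, -2.0, 0.0, 0.5, 0.0])
    y = X @ w_true + rng.standard_t(df=2.5, size=n)       # heavy-tailed noise
    y[:20] += 50.0                                        # a small fraction of gross outliers
    w_hat = mom_least_squares(X, y)
    print("estimation error:", round(float(np.linalg.norm(w_hat - w_true)), 3))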